5 research outputs found

    NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads

    We focus on reconstructing high-fidelity radiance fields of human heads, capturing their animations over time, and synthesizing re-renderings from novel viewpoints at arbitrary time steps. To this end, we propose a new multi-view capture setup composed of 16 calibrated machine vision cameras that record time-synchronized images at 7.1 MP resolution and 73 frames per second. With our setup, we collect a new dataset of over 4700 high-resolution, high-framerate sequences of more than 220 human heads, from which we introduce a new human head reconstruction benchmark. The recorded sequences cover a wide range of facial dynamics, including head motions, natural expressions, emotions, and spoken language. In order to reconstruct high-fidelity human heads, we propose Dynamic Neural Radiance Fields using Hash Ensembles (NeRSemble). We represent scene dynamics by combining a deformation field and an ensemble of 3D multi-resolution hash encodings. The deformation field allows for precise modeling of simple scene movements, while the ensemble of hash encodings helps to represent complex dynamics. As a result, we obtain radiance field representations of human heads that capture motion over time and facilitate re-rendering from arbitrary novel viewpoints. In a series of experiments, we explore the design choices of our method and demonstrate that our approach outperforms state-of-the-art dynamic radiance field approaches by a significant margin.
    Comment: SIGGRAPH 2023, Project Page: https://tobias-kirschstein.github.io/nersemble/ , Video: https://youtu.be/a-OAWqBzld
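    To make the hash-ensemble idea concrete, here is a minimal PyTorch sketch of a dynamic field in the spirit of NeRSemble: a deformation MLP warps points toward a canonical space, and learned per-frame weights blend an ensemble of spatial encodings before a small decoder. The module names, layer sizes, and the plain MLPs standing in for instant-NGP-style multi-resolution hash grids are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class HashEnsembleField(nn.Module):
    """Sketch: deformation field + blended ensemble of spatial encodings."""

    def __init__(self, n_grids=4, feat_dim=16, n_frames=100):
        super().__init__()
        # Stand-ins for multi-resolution hash grids (the real method uses
        # instant-NGP-style hash encodings); here, one small MLP per grid.
        self.grids = nn.ModuleList(
            nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim))
            for _ in range(n_grids)
        )
        # Learned per-frame blend weights over the ensemble.
        self.blend = nn.Embedding(n_frames, n_grids)
        # Deformation field: (point, time code) -> spatial offset.
        self.time_code = nn.Embedding(n_frames, 8)
        self.deform = nn.Sequential(nn.Linear(3 + 8, 64), nn.ReLU(), nn.Linear(64, 3))
        self.decoder = nn.Linear(feat_dim, 4)  # density + RGB features

    def forward(self, x, t):
        # x: (B, 3) points, t: (B,) integer frame indices.
        z = self.time_code(t)                                   # (B, 8)
        x_canon = x + self.deform(torch.cat([x, z], dim=-1))    # warp to canonical space
        feats = torch.stack([g(x_canon) for g in self.grids], dim=-1)  # (B, F, G)
        w = torch.softmax(self.blend(t), dim=-1).unsqueeze(1)   # (B, 1, G)
        return self.decoder((feats * w).sum(dim=-1))            # (B, 4)
```

    A call such as HashEnsembleField()(torch.rand(1024, 3), torch.zeros(1024, dtype=torch.long)) returns per-point outputs for frame 0; simple motions are absorbed by the deformation, while the per-frame blend weights let the ensemble cover complex dynamics.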

    Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing

    In this paper, we define and study a new Cloth2Body problem, whose goal is to generate a 3D human body mesh from a 2D clothing image. Unlike the existing human mesh recovery problem, Cloth2Body must address new challenges raised by the partial observation of the input and the high diversity of the output. Specifically, there are three challenges: first, how to locate and pose human bodies inside the clothes; second, how to effectively estimate body shapes from various clothing types; and finally, how to generate diverse and plausible results from a 2D clothing image. To this end, we propose an end-to-end framework that can accurately estimate a 3D body mesh, parameterized by pose and shape, from a 2D clothing image. Along this line, we first utilize Kinematics-aware Pose Estimation to estimate body pose parameters. A 3D skeleton is employed as a proxy, followed by an inverse kinematics module to boost estimation accuracy. We additionally design an adaptive depth trick to better align the re-projected 3D mesh with the 2D clothing image by disentangling the effects of object size and camera extrinsics. Next, we propose Physics-informed Shape Estimation to estimate body shape parameters. 3D shape parameters are predicted from partial body measurements estimated from the RGB image, which not only improves pixel-wise human-cloth alignment but also enables flexible user editing. Finally, we design an Evolution-based Pose Generation method, a skeleton-transplanting approach inspired by genetic algorithms, to generate diverse and plausible poses during inference. As shown by experimental results on both synthetic and real-world data, the proposed framework achieves state-of-the-art performance and can effectively recover natural and diverse 3D body meshes from 2D images that align well with the clothing.
    Comment: ICCV 2023 Poster
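    The adaptive depth trick can be illustrated with basic pinhole-camera geometry. The sketch below is an assumption about the general idea, not the paper's exact formulation (the function name, parameters, and default body height are hypothetical): it picks the depth at which a body of the estimated metric height re-projects to the observed 2D clothing extent, separating apparent size from camera placement.

```python
import numpy as np

def adaptive_depth_translation(f, cx, cy, bbox, body_height_m=1.7):
    """Hypothetical pinhole-camera illustration of adaptive-depth alignment.

    bbox: (x_min, y_min, x_max, y_max) of the clothing in pixels;
    f, cx, cy: focal length and principal point in pixels.
    """
    x0, y0, x1, y1 = bbox
    h_px = y1 - y0
    z = f * body_height_m / h_px              # depth from similar triangles
    u, v = (x0 + x1) / 2.0, (y0 + y1) / 2.0   # clothing box center in pixels
    x = (u - cx) * z / f                      # back-project the center
    y = (v - cy) * z / f
    return np.array([x, y, z])                # candidate body translation

# Example: 640x480 image, f = 800 px, clothing box spanning 300 px in height.
print(adaptive_depth_translation(800.0, 320.0, 240.0, (200, 90, 440, 390)))
```

    The point of the decomposition is that the box height fixes only the depth, while the box center fixes the lateral translation, so body size and camera placement no longer confound each other.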

    Revisiting Event-based Video Frame Interpolation

    Dynamic vision sensors or event cameras provide rich complementary information for video frame interpolation. Existing state-of-the-art methods follow the paradigm of combining both synthesis-based and warping networks. However, few of those methods fully respect the intrinsic characteristics of event streams. Given that event cameras only encode intensity changes and polarity rather than color intensities, estimating optical flow from events is arguably more difficult than from RGB information. We therefore propose to incorporate RGB information in an event-guided optical flow refinement strategy. Moreover, in light of the quasi-continuous nature of the time signals provided by event cameras, we propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages rather than in a single, long stage. Extensive experiments on both synthetic and real-world datasets show that these modifications lead to more reliable and realistic intermediate frame results than previous video frame interpolation methods. Our findings underline that careful consideration of event characteristics such as high temporal density and elevated noise benefits interpolation accuracy.
    Comment: Accepted by IROS 2023, Project Site: https://jiabenchen.github.io/revisit_even
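    The divide-and-conquer idea can be sketched with the standard event-camera model, in which each event shifts a pixel's log intensity by a contrast threshold. The toy code below is an illustrative assumption, not the paper's learned pipeline: in the actual method each stage is a simplified synthesis network, whereas here the stages merely show the event stream being consumed in short increments up to the target time.

```python
import numpy as np

def incremental_synthesis(frame0, events, t_target, n_stages=4):
    """Toy staged event integration (names and threshold are assumptions).

    frame0: (H, W) intensity image at t = 0.
    events: (N, 4) array of (x, y, t, polarity) with polarity in {-1, +1}.
    """
    log_img = np.log(frame0.astype(np.float64) + 1e-6)
    c = 0.2                                    # assumed contrast threshold
    edges = np.linspace(0.0, t_target, n_stages + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Each stage consumes only a short slice of the event stream; in the
        # paper this is where a simplified synthesis stage would run.
        chunk = events[(events[:, 2] >= lo) & (events[:, 2] < hi)]
        for x, y, _, p in chunk:
            log_img[int(y), int(x)] += c * p   # each event shifts log intensity
    return np.exp(log_img)
```

    Keeping each stage short limits how much noise and temporal density any single step must absorb, which is the intuition behind replacing one long synthesis step with several simplified ones.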

    KGDet: Keypoint-Guided Fashion Detection

    Locating and classifying clothes, usually referred to as clothing detection, is a fundamental task in fashion analysis. Motivated by the strong structural characteristics of clothes, we pursue a detection method enhanced by clothing keypoints, which are a compact and effective representation of garment structure. To incorporate the keypoint cues into clothing detection, we design a simple yet effective Keypoint-Guided clothing Detector, named KGDet. Such a detector can fully exploit the information provided by keypoints in two ways: i) integrating local features around keypoints to benefit both classification and regression; ii) generating accurate bounding boxes from keypoints. To effectively incorporate the local features, two alternative modules are proposed: one is a multi-column keypoint-encoding-based feature aggregation module; the other is a keypoint-selection-based feature aggregation module. With either of the above modules as a bridge, a cascade strategy is introduced to refine detection performance progressively. Thanks to the keypoints, our KGDet obtains superior performance on the DeepFashion2 and FLD datasets with high efficiency.
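    As a rough illustration of the second aspect, generating boxes from keypoints, consider the sketch below. It is a hand-written heuristic under assumed names and thresholds, not KGDet's learned module: it takes the tight extent of confidently detected clothing landmarks and pads it by a margin instead of regressing the box from appearance alone.

```python
import numpy as np

def box_from_keypoints(keypoints, scores, score_thresh=0.3, margin=0.1):
    """Heuristic keypoint-to-box conversion (illustrative, not KGDet's).

    keypoints: (K, 2) pixel coordinates; scores: (K,) confidences.
    Returns (x0, y0, x1, y1) or None if no keypoint is confident enough.
    """
    kept = keypoints[scores >= score_thresh]   # drop low-confidence landmarks
    if len(kept) == 0:
        return None
    x0, y0 = kept.min(axis=0)                  # tight extent of the landmarks
    x1, y1 = kept.max(axis=0)
    pad_x, pad_y = margin * (x1 - x0), margin * (y1 - y0)
    return np.array([x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y])
```

    Because clothing landmarks lie on the garment itself, a box derived from them tends to hug the object more tightly than one regressed from coarse anchor features, which is the motivation the abstract gives for keypoint-guided box generation.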